{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Model Repository API\n",
    "\n",
    "MLServer supports loading and unloading models dynamically from a models repository.\n",
    "This allows you to enable and disable the models accessible by MLServer on demand.\n",
    "This extension builds on top of the support for [Multi-Model Serving](../mms/README.md), letting you change at runtime which models is MLServer currently serving.\n",
    "\n",
    "The API to manage the model repository is modelled after [Triton's Model Repository extension](https://github.com/triton-inference-server/server/blob/master/docs/protocol/extension_model_repository.md) to the V2 Dataplane and is thus fully compatible with it.\n",
    "\n",
    "This notebook will walk you through an example using the Model Repository API.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training\n",
    "\n",
    "First of all, we will need to train some models.\n",
    "For that, we will re-use the models we trained previously in the [Multi-Model Serving example](../mms/README.md).\n",
    "You can check the details on how they are trained following that notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cp -r ../mms/models/* ./models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Serving\n",
    "\n",
    "Next up, we will start our `mlserver` inference server.\n",
    "Note that, by default, this will **load all our models**.\n",
    "\n",
    "```shell\n",
    "mlserver start .\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## List available models\n",
    "\n",
    "Now that we've got our inference server up and running, and serving 2 different models, we can start using the Model Repository API.\n",
    "To get us started, we will first list all available models in the repository."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "response = requests.post(\"http://localhost:8080/v2/repository/index\", json={})\n",
    "response.json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can, the repository lists 2 models (i.e. `mushroom-xgboost` and `mnist-svm`).\n",
    "Note that the state for both is set to `READY`.\n",
    "This means that both models are loaded, and thus ready for inference."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Unloading our `mushroom-xgboost` model\n",
    "\n",
    "We will now try to unload one of the 2 models, `mushroom-xgboost`.\n",
    "This will unload the model from the inference server but will keep it available on our model repository."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "requests.post(\"http://localhost:8080/v2/repository/models/mushroom-xgboost/unload\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we now try to list the models available in our repository, we will see that the `mushroom-xgboost` model is flagged as `UNAVAILABLE`.\n",
    "This means that it's present in the repository but it's not loaded for inference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = requests.post(\"http://localhost:8080/v2/repository/index\", json={})\n",
    "response.json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading our `mushroom-xgboost` model back\n",
    "\n",
    "We will now load our model back into our inference server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "requests.post(\"http://localhost:8080/v2/repository/models/mushroom-xgboost/load\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we now try to list the models again, we will see that our `mushroom-xgboost` is back again, ready for inference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = requests.post(\"http://localhost:8080/v2/repository/index\", json={})\n",
    "response.json()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
